Detecting laughter and filled pauses using syllable-based features
Abstract
Identifying laughter and filled pauses is important for understanding spontaneous human speech. Both are common vocal expressions that are non-lexical yet highly communicative. In this paper, we use a two-tiered system to identify laughter and filled pauses. We first generate frame-level hypotheses and then rescore them using features derived from acoustic syllable segmentation. Using the SVC corpus from the Interspeech 2013 ComParE challenge, we find that rescoring, together with syllable-based acoustic/prosodic features, allows laughter and filled pauses to be detected at 89.3% UAAUC (unweighted average area under the curve) on the development set, an improvement of 1.7% over the challenge baseline.
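The two-tier idea in the abstract (frame-level hypotheses, then rescoring over syllable-sized segments) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' method: the abstract does not say how rescoring is done, so this sketch simply averages frame scores within each segment, and the function names `rescore_by_segment`, `auc`, and `uaauc` are hypothetical. Segment boundaries are passed in by hand, since no syllable segmenter is included.

```python
from statistics import mean


def rescore_by_segment(frame_scores, segments):
    """Replace each frame score with the mean score of its segment.

    frame_scores: per-frame scores for one class (e.g. laughter).
    segments: (start, end) frame indices, end exclusive, standing in for
        syllable boundaries (hypothetical input; no segmenter is provided).
    Frames outside every segment keep their original score.
    """
    rescored = list(frame_scores)
    for start, end in segments:
        if end <= start:
            continue  # skip empty or malformed segments
        segment_mean = mean(frame_scores[start:end])
        for i in range(start, end):
            rescored[i] = segment_mean
    return rescored


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative frame")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def uaauc(per_class):
    """Unweighted average of per-class AUCs.

    per_class: list of (scores, labels) pairs, one pair per class.
    """
    return mean(auc(scores, labels) for scores, labels in per_class)
```

In use, one would call `rescore_by_segment` once per class and pass the rescored scores, paired with the reference frame labels, to `uaauc`. Averaging is only the simplest stand-in for a rescoring step; a classifier trained on segment-level features would take its place in a fuller system.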
Similar resources
Exploring the Body and Head Kinematics of Laughter, Filled Pauses and Breaths
We present ongoing work in the DUEL project, which focuses on the study of disfluencies, exclamations, and laughter in dialogue. Here we focus on the multimodal aspects of disfluent vocalizations, namely laughter and laughed speech, filled pauses, and breathing noises. We exemplify these phenomena in the rich multimodal Dream Apartment Corpus, a natural dialogue corpus, which, in addition to co...
Detecting Filled Pauses in Tutorial Dialogs
As dialog systems become more capable, users tend to talk more spontaneously and less formally. Spontaneous speech includes features which convey information about the user’s state. In particular, filled pauses, such as um and uh, can indicate that the user is having trouble, wants more time, wants to hold the floor, or is uncertain. In this paper we present a first study of the acoustic charac...
A Feature-based Filled Pause Detection System for Dutch
Nowadays, automatic speech recognizers have become quite good in recognizing well prepared fluent speech (e.g. news readings). However, the recognition of unprepared or spontaneous speech is still problematic. Some important reasons for this are that spontaneous speech is less articulated, exhibits a high speaking rate and usually contains a lot of disfluencies. The latter occur when the speake...
Detection of nonverbal vocalizations using Gaussian mixture models: looking for fillers and laughter in conversational speech
In this paper, we analyze acoustic profiles of fillers (i.e. filled pauses, FPs) and laughter with the aim to automatically localize these nonverbal vocalizations in a stream of audio. Among other features, we use voice quality features to capture the distinctive production modes of laughter and spectral similarity measures to capture the stability of the oral tract that is characteristic for F...
Combining acoustic and visual features to detect laughter in adults' speech
Laughter can not only convey the affective state of the speaker but also be perceived differently based on the context in which it is used. In this paper, we focus on detecting laughter in adults’ speech using the MAHNOB laughter database. The paper explores the use of novel long-term acoustic features to capture the periodic nature of laughter and the use of computer vision-based smile feature...
Publication date: 2013